ECE 257A: Fault-Tolerant Computing
Abstract
Course: ECE 257A – Fault-Tolerant Computing, University of California, Santa Barbara, Fall 2006, Enrollment Code 49585

Catalog entry: 257A. Fault-Tolerant Computing. (4) PARHAMI. Prerequisite: ECE 154. Lecture, 4 hours. Basic concepts of dependable computing. Reliability of nonredundant and redundant systems. Dealing with circuit-level defects. Logic-level fault testing and tolerance. Error detection and correction. Diagnosis and reconfiguration for system-level malfunctions. Degradation management. Failure modeling and risk assessment. (F)

Instructor: Behrooz Parhami, Room 5155 Harold Frank Hall (Engr I), Phone 805-893-3211, [email protected]

Meetings: Tuesdays and Thursdays, 10:00-11:30 AM, Phelps 1431

Consultation: Open office hours, held in Room 5155 Harold Frank Hall (Engr I): Tuesdays 11:30-1:00, Thursdays 8:30-10:00

Motivation: Dependability concerns are integral parts of engineering design. Ideally, we would like our computer systems to be perfect, always yielding timely and correct results. However, just as bridges collapse and airplanes crash occasionally, so too computer hardware and software cannot be made totally immune to unpredictable behavior. Despite great strides in component reliability and programming methodology, the exponentially increasing complexity of integrated circuits and software systems makes the design of perfect computer systems nearly impossible. In this course, we study the causes of computer system failures (impairments to dependability), techniques for ensuring correct and timely computations despite such impairments, and tools for evaluating the quality of proposed or implemented solutions.

Prerequisites: Basic computer architecture at the level of ECE 154.

References:

Required textbook: None (a class handout or reference will be provided before each lecture)

Other useful books, not required:
Pradhan, D.K. (ed.), Fault-Tolerant Computer System Design, Prentice-Hall, 1996. [out of print, as of 9/2006]
Siewiorek, D.P. and R.S. Swarz, Reliable Computer Systems: Design and Evaluation, Digital Press, 2nd ed., 1992.
Johnson, B.W., Design and Analysis of Fault-Tolerant Digital Systems, Addison-Wesley, 1989.
Lala, P.K., Self-checking and Fault-Tolerant Digital Design, Morgan Kaufmann, 2001.
Shooman, M.L., Reliability of Computer Systems and Networks, Wiley, 2002.

Journals: IEEE Trans. Dependable and Secure Computing (new, since 2004), IEEE Trans. Computers, IEEE Trans. Reliability, IEEE Trans. Software Engineering, ACM Trans. Computer Systems, and Information Processing Letters. Also, IEEE Computer, IEEE Micro, IEEE Design & Test of Computers, and ACM Computing Surveys are good sources for broad introductory papers.

Conferences: Int'l Conf. Dependable Systems and Networks (DSN, annual, since 1971; formerly known as FTCS), Pacific Rim Int'l Symp. Dependable Computing (PRDC, since 1989), IFIP Int'l Working Conf. Dependable Computing for Critical Applications (DCCA; discontinued and merged with FTCS to form DSN), Int'l Symp. Software Reliability Engineering (ISSRE, annual, since 1990).

Electronic resources at UCSB: Journals and conference proceedings listed above, as well as many other useful references, can be accessed electronically via:
http://www.library.ucsb.edu/eresources/databases/ (electronic journals, collections, etc.)
http://www.library.ucsb.edu/subjects/engineering/ece.html (research guide in ECE)
Similar resources
Fault Tolerant DNA Computing Based on Digital Microfluidic Biochips
Historically, DNA molecules have been known as the building blocks of life. In 1994, Leonard Adleman introduced a technique to utilize DNA molecules for a new kind of computation. Owing to its massive parallelism, huge storage capacity, and the ability to use DNA molecules inside living tissue, this type of computation is applied in many application areas such as me...
Fault Tolerant Reversible QCA Design using TMR and Fault Detecting by a Comparator Circuit
Quantum-dot Cellular Automata (QCA) is an emerging and promising technology that provides significant improvements over CMOS. Recently, QCA has been advocated as a candidate for implementing reversible circuits. However, QCA, like other nanotechnologies, suffers from a high fault rate. The main purpose of this paper is to develop a fault tolerant model of QCA circuits by redundancy in hardware a...
An approach to fault detection and correction in design of systems using of Turbo codes
We present an approach to the design of fault tolerant computing systems. In this paper, a technique is employed that enables the combination of several codes, in order to obtain flexibility in the design of error correcting codes. Code-combining techniques are very effective, and turbo codes are one such class of codes. The algorithm-based fault tolerance techniques that detect errors rely on the c...
Quantum Error Correction and Fault Tolerant Quantum Computing
efficient fault-tolerant quantum computing arxiv fault-tolerant quantum computing crcnetbase an introduction to quantum error correction and fault quantum error correction and fault tolerant quantum computing fault tolerance in quantum computation eceu fault-tolerant quantum computation world scientific fault-tolerant quantum computation versus realistic noise quantum error correction and fault-...
Publication date: 2009